Data biography: Ku Klux Klan (KKK) Ledgers in the Greater Denver area

The Ku Klux Klan (KKK) is one of the most notorious white supremacist organizations in the history of the United States. First founded in 1865, after the end of the Civil War, it claimed to defend the interests of white people and opposed the freedom and civil rights of Black people and immigrants. The group reached the height of its expansion in the 1920s, with members infiltrating local government, law enforcement, and other organizations.

Colorado was one of the Klan's active areas in the 1920s, and the core object of this data biography is the Ku Klux Klan Ledgers from Greater Denver (1924-1926). This historical archive is publicly available from History Colorado. The dataset contains detailed information on Denver-area KKK members and their associates during 1924-1926, including names, addresses, and telephone numbers.


Who?

These ledgers were originally compiled by the Ku Klux Klan in the 1920s to manage membership and organize activities in the Denver area. The Klan was powerful in Colorado at the time, and it recorded detailed member information by hand, such as names, addresses, and contact information, for internal administrative use.

The physical ledgers were donated anonymously to History Colorado in 1946 through a staff member of the Rocky Mountain News. History Colorado has since preserved and digitized them, and it makes the ledgers, along with their digital form, available on its website for the public to view and download.

Above is a screenshot of the scanned PDF of the original ledger, in which the individual records are clearly visible.

When & Where?

The information was recorded by the Ku Klux Klan between 1924 and 1926 in and around the Greater Denver area. The ledger records reflect the organization's extensive social penetration and its organizational management. In 2021, History Colorado digitized the data and made it available to the public and researchers as part of its historical archives.

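In [30]:
import pandas as pd

# Load the ledger index published by History Colorado
kkk_df = pd.read_csv('kkk-ledgers-index.csv', low_memory=False)
In [32]:
import plotly.express as px

# Count people per city of residence; rename_axis + reset_index(name=...)
# gives the same column names on older and newer pandas versions
city_counts = (
    kkk_df['residenceCity'].value_counts().nlargest(11)
    .rename_axis('residenceCity').reset_index(name='count')
)
# Drop the first row (the most common city, assumed to be Denver) and keep the next 10
city_count = city_counts.tail(10)
fig = px.bar(
    city_count,
    x='residenceCity',
    y='count',
    title='Number of persons residing outside Denver',
    labels={'residenceCity': 'City', 'count': 'Count'}
)
fig.show()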
In [33]:
pd.set_option('display.max_columns', None)
sample_df = kkk_df[['fullName', 'Business Address']].dropna().sample(20)
sample_df
Out[33]:
       fullName                Business Address
5278   Bruce H Foster          23rd & Blake
1116   Wm H Hedley, Jr.        1636 Champa
10816  Allison R Beckmann      124 W 14th Ave
2101   Chas H Burnham          1811 Glenarm St.
8929   Clarence Edwin Fraser   1200 Bannock
3033   Lyle Abram Holland      2737 W 27th Ave
8355   Jack C Dister           1408 Curtis
5738   Darrell W Meacham       2900 Welton
16250  Wm W Fritts             1119 18th St
7543   Fred'k C Butterfield    1632 Court Pl
3500   Valentine Emil Joerger  1916 Blake
12183  Wm A Trowbridge         515 Kittredge Bldg, 16th & Glenarm*
14040  Frank Henson            3825 Tennyson
6200   M J Roberts             1541 Lincoln
8090   Roy S Walters           547 Galapago
14310  Andrew C Campbell       2424 East Colfax
4905   Theodore F Clark, Jr.   736 14th St
9005   Lucion Grady Hubbard    516 Foster Bldg, 16th & Champa*
6915   Henry M Duff            1736 Platte St
9927   John Henry Long         17th & Broadway*

The chart and table above hint at the KKK's influence at the time: in the Denver area alone, members were spread across many businesses and institutions, and membership extended to cities beyond Denver.
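
The bar chart code above relies on Denver being the most common city, so that dropping the first row removes it. A minimal sketch of an alternative that filters Denver out by name is shown here on a small made-up table. Only the `residenceCity` column name comes from the ledger index; the rows and the exact spelling "Denver" are assumptions for illustration.

```python
import pandas as pd

# Toy stand-in for kkk_df; these rows are invented for illustration only.
toy_df = pd.DataFrame({
    "residenceCity": ["Denver", "Denver", "Denver", "Arvada",
                      "Englewood", "Englewood", "Littleton", None],
})

# Drop missing values and stray whitespace, then exclude Denver by name
# (assumes the city is spelled "Denver" in the data).
cities = toy_df["residenceCity"].dropna().str.strip()
outside = cities[cities.str.lower() != "denver"]

# rename_axis + reset_index(name=...) yields the same column names
# on older and newer pandas versions.
outside_counts = (
    outside.value_counts()
    .rename_axis("residenceCity")
    .reset_index(name="count")
)
print(outside_counts)
```

Filtering by name keeps the chart correct even if another city ever outranked Denver, at the cost of depending on how the city is spelled in the file.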


How?

The ledgers were originally written by hand and record several kinds of personal information. Because they were meant primarily for internal management, the information is highly detailed.

Post-processing involved scanning the ledgers into PDF images, applying OCR for text recognition, and then manually reviewing the results and converting them to CSV format for easier analysis. History Colorado provides both the PDF images and the CSV files.

In [40]:
# Non-null count per column; keep the 23 columns with the fewest non-null values
non_counts = kkk_df.notnull().sum().sort_values(ascending=False).tail(23).reset_index()
non_counts.columns = ['Column', 'NonNullCount']
fig = px.bar(
    non_counts,
    x='Column',
    y='NonNullCount',
    title='Non-null values per column',
    labels={'Column': '', 'NonNullCount': 'Count'},
    color='NonNullCount',
    color_continuous_scale='Inferno_r',
    template='plotly_white'
)
fig.show()

We can see that, apart from names, the number of entries in the other member-related fields drops by almost half, which suggests that the KKK focused on names when keeping this ledger and treated addresses and other information as secondary.

This may also be because some people were reluctant to give out their personal information (after all, they were joining an unofficial organization). It may also suggest that the Ku Klux Klan, as a racist organization, did not need much information to keep in touch with its members (it appears one could become a member just by giving a name).
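
To put a number on how sparsely a field is filled, the non-null counts can be turned into shares of all rows. The sketch below does this on a made-up table; `fullName` and `Business Address` are column names used earlier in this notebook, while `residenceAddress` and all the rows are hypothetical.

```python
import pandas as pd

# Toy stand-in for kkk_df; rows and the 'residenceAddress' name are invented.
toy_df = pd.DataFrame({
    "fullName": ["A", "B", "C", "D"],
    "residenceAddress": ["1 Main St", None, "2 Oak Ave", None],
    "Business Address": [None, None, "3 Elm St", None],
})

# notna() gives a True/False table; the column mean is the share of filled cells.
fill_rate = toy_df.notna().mean().sort_values(ascending=False)
print((fill_rate * 100).round(1))
```

Shares are easier to compare across columns than raw counts, and they make a statement like "almost halved" directly checkable against the real file.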


Why?

The Klan's original purpose in collecting this data was probably to manage its internal organization, collect dues, and keep track of social networks, in the hope of strengthening its influence through membership management.

In the 21st century, History Colorado has turned the data toward openly stated purposes: education, historical transparency, and social reflection. Making these materials on racism public helps people better understand the social impact of extremism and provides real, powerful material for social education.

Image of a Ku Klux Klan rally showing its iconic symbol: a burning cross

The original ledgers can be seen at the History Colorado Center. Online, the data is stored in two forms: PDF image files, which retain the original appearance of the ledgers and are convenient for historical comparison and direct reading, and a CSV file, which is suitable for structured analysis.

Data fields include full names, addresses, phone numbers, business addresses, member numbers, ledger page numbers, and supplementary fields such as "symbolExists" and "Notes & Remarks". The author uses the CSV file provided by History Colorado and reads and analyzes it with Python in a Jupyter Notebook.

Get Access To Data

Resources accessible on the Web

The dataset has some gaps in record completeness: as noted on the website, the first 69 records are missing, and many address and telephone values are empty. In addition, columns such as "Notes & Remarks" use many abbreviations or specific code names, so readers may need more specialized historical context to understand them.

The data covers only 1924-1926 and is limited to the Greater Denver area, so it may not give a complete picture of the Klan nationwide.

In [51]:
# Sample 10 rows that have a note; the fixed seed makes the sample reproducible
notes = kkk_df[['fullName', 'Notes & Remarks']].dropna().sample(n=10, random_state=42)
notes
Out[51]:
       fullName                 Notes & Remarks
440    Harry R Miller           Name struck; "DECEASED"
10059  Harry Ellsworth Meloeny  Name struck; "DECEASED"
21001  Leslie E Keithline       Arvada - Paid Kerk Aug. 8, 1924 #36
26731  George H Goulden         Englewood to Brock 10/18
4439   Roy E Merritt            Name struck; "RESIGNED 6-15-26"
28311  George A Zuber           Rejected ref. 114506
22593  William C Callahan       Rejected; Ref 9/18 142127
17049  John R Buckwalter        Says he can't go there; Returned his own check
21491  Terry J Miller           Littleton - Paid Kerk Aug. 8, 1924 #36
26372  Ames A Martin            Rejected ref. 113529

As you can see, in these 10 randomly selected rows, much of the information in Notes & Remarks is hard to interpret without additional context.
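
Some of the notes do follow recurring patterns ("Name struck", "Rejected", "Paid"), so a rough first pass could sort them into coarse groups before any closer reading. The sketch below uses a few note strings from the sample above; the category labels and regular expressions are hypothetical and would need checking against the full column and the historical context.

```python
import re

import pandas as pd

# A few note strings copied from the sample output above.
notes_sample = pd.Series([
    'Name struck; "DECEASED"',
    'Name struck; "RESIGNED 6-15-26"',
    "Rejected ref. 114506",
    "Arvada - Paid Kerk Aug. 8, 1924 #36",
    "Englewood to Brock 10/18",
])

# Hypothetical categories; the first matching pattern wins.
patterns = {
    "deceased": r"DECEASED",
    "resigned": r"RESIGNED",
    "rejected": r"^Rejected",
    "payment": r"\bPaid\b",
}

def categorize(note):
    """Return the first category whose pattern matches, else 'other'."""
    for label, pattern in patterns.items():
        if re.search(pattern, note, flags=re.IGNORECASE):
            return label
    return "other"

categories = notes_sample.map(categorize)
print(categories.value_counts())
```

Anything that lands in "other" is exactly the material that still needs specialized historical knowledge to read.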


One interesting point is that the ledger has a column recording whether a symbol appears next to each member (symbolExists), which I think could be used to distinguish actual KKK members from people who were merely associated with the Klan.

In [56]:
symbol_counts = kkk_df['symbolExists'].value_counts().reset_index()
symbol_counts.columns = ['Symbol', 'Count']

fig = px.pie(symbol_counts, names='Symbol', values='Count',
             title='People with Symbol')

fig.show()

According to the chart above, only 25.4% of the people listed have a symbol, which may indicate that truly fanatical members of the Ku Klux Klan were not that numerous and that their number is greatly overestimated.

The Ku Klux Klan was a racist, xenophobic organization with deep historical roots that became a political force in several states in the United States during the 1920s. It opposed Black people, Jews, Catholics, immigrants, gay people, and other marginalized groups. When using this data, we must be wary of its political and racist tendencies and avoid inflicting further harm on the victims of history. While the disclosure of the data can be educational, it can also raise ethical and privacy concerns for the descendants of the people in the ledger. This data should be handled with respect and caution.
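
A percentage like this depends on what it is a share of: `value_counts()` drops missing values by default, so a pie chart built from it shows shares among rows that have a value at all. The sketch below compares both denominators on made-up data. The "Yes"/"No" labels are an assumption, since the actual values stored in `symbolExists` are not shown in this notebook.

```python
import pandas as pd

# Toy stand-in for kkk_df['symbolExists']; labels and rows are invented.
toy_symbol = pd.Series(["Yes", "No", "No", "No", None], name="symbolExists")

# Share among rows with a recorded value (missing rows are dropped by default).
share_recorded = toy_symbol.value_counts(normalize=True)

# Share among all rows, counting missing values as their own group.
share_all = toy_symbol.value_counts(normalize=True, dropna=False)

print(share_recorded["Yes"], share_all["Yes"])
```

If the real column has many empty cells, the two figures can differ noticeably, which is worth checking before reading much into the 25.4%.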

The madness of the KKK

This dataset helps people understand the organization and operation of the Ku Klux Klan in the 1920s, and it shows that the collection and publication of data is never neutral. The data is both a piece of the history of extremist groups and an important source of information for today's society as it confronts issues of discrimination, hatred, and historical justice. People should continue to uncover the social structures hidden behind the data, strengthen the memory of historical injustice and the awareness of resistance, and make the data truly serve the goals of social progress, fairness, and justice.